in a Cyber-Attack Scenario
ABSTRACT

In a corporate network, the situation awareness (SA) of a security analyst is of particular interest. The
current work describes a cognitive Instance-Based Learning (IBL) model of an analyst’s recognition
and comprehension processes in a cyber-attack scenario. The IBL model first recognizes network events
based upon the events’ situation attributes and their similarity to past experiences (instances) stored in the
model’s memory. Then, the model comprehends a sequence of observed events as being a cyber-attack
or not, based upon the instances retrieved from its memory, the similarity mechanism used, and the model’s
risk-tolerance. The execution of the model generates predictions about the recognition and comprehension
processes of an analyst in a cyber-attack. A security analyst’s decisions in the model are evaluated
based upon two cyber-SA metrics: accuracy and timeliness. The chapter highlights the potential of
this research for the design of training and decision support tools for security analysts.


INTRODUCTION 

With the prevalence of WikiLeaks hacks and other threats to corporate and national cybersecurity,
guarding against cyber-attacks today is becoming a significant part of IT governance, especially
because most government agencies have moved to online systems (Sideman, 2011). In order to
protect national cybersecurity, leaders from the Defense Department, NATO, and the European
Union assembled in Brussels recently to discuss a plan to prevent, detect, defend, and recover
from cyber-attacks (Sideman, 2011). The leaders there agreed that existing cybersecurity measures
were incomplete and decided to fast-track a new plan for cyber-incident response. Similarly, the
Department of Homeland Security (DHS) has recently launched a national campaign called
“Stop|Think|Connect,” aiming to cultivate a collective sense of cyber-civic duty among personnel
in organizations and enterprises that help preserve cybersecurity (Lute & McConnell, 2011). The
DHS’ message begins with the following wisdom:

Senior management in each and every office, company and department, whether private or
public, must take responsibility for the protection of its own systems and information, by fielding
up-to-date security technology, training employees to avoid common vulnerabilities, and reporting
cybercrime when it occurs. (Lute & McConnell, 2011, p. 1)

As 80%-90% of what individuals and the government do using the Internet today depends upon
private corporate networks provided by organizations and enterprises (Sideman, 2011), corporate
networks that ensure our cybersecurity have, according to DHS, much bigger responsibilities than
previously thought (Lute & McConnell, 2011). Thus, meeting the DHS’ objectives in a corporate
network requires cyber situation-awareness (SA), a three-stage process that includes recognition
(awareness of the current situation in the network); comprehension (awareness of malicious
behavior in the current situation in the network); and projection (assessment of possible
future courses of action resulting from the current situation in the network) (Endsley, 1995; Tadda,
Salerno, Boulware, Hinman, & Gorton, 2006).

The ability of a corporate network to protect itself from a cyber-attack using cyber-tools and
algorithms without any intervention from human decision-makers is still a distant goal (Jajodia,
Liu, Swarup, & Wang, 2010). Thus, the role of human decision-makers in security systems is
crucial and indispensable (Gardner, 1987; Johnson-Laird, 2006).

In the absence of perfect cyber-SA tools to recognize, comprehend, and make projections about
cyber-attacks (PSU, 2011), a key role in the cybersecurity process is that of a security analyst. The
security analyst is a human decision-maker who is in charge of protecting the online operations of
a corporate network (e.g., an online retail company with an external Webserver and an internal
fileserver) from threats of random or organized cyber-attacks. However, very little is currently
known about how the cognitive processes of the security analyst (such as memory, risk-tolerance,
and similarity judgments) might influence the analyst’s cyber-SA and his ability to detect
cyber-attacks in corporate networks under different scenarios (Jajodia et al., 2010; PSU, 2011).
Also, there currently seems to be a big gap between how security analysts function in the real world
according to their cognitive processes and how the cyber-SA tools and algorithms that are intended
to replace human analysts sometime in the future function (Jajodia et al., 2010; PSU, 2011). For
these reasons, it is important to investigate the influence of the cognitive processes of a security
analyst on his cyber-SA in popular cyber-attack scenarios.

Past literature shows there has been only one known cognitive attempt, through an expert system
called R-CAST, to understand the cognitive decision-making aspects of a security analyst’s
cyber-SA (Fan & Yen, 2007; Jajodia et al., 2010). R-CAST is a team-oriented cognitive-agent
architecture that is a computational implementation of Klein’s Recognition-Primed Decision
(RPD) model (Klein, 1989). R-CAST, being a computational implementation of RPD, is a
rule-based system that requires an a priori knowledge base about cyber-attacks in the scenario in
which it makes decisions (Fan & Yen, 2007). The a priori knowledge base is used during a mental
simulation in the RPD. In the mental simulation, R-CAST applies rules that are constrained by
the cyber-attack scenario in which it operates to determine the future courses of action
(Fan & Yen, 2007). The cognitive approach taken in this chapter (more details below) does
not depend on an existing knowledge base or on predetermined future courses of action, as
assumed in R-CAST.

The main purpose of this chapter is to describe a cognitive model of the recognition and
comprehension processes in a security analyst’s cyber-SA. The model is based on Instance-Based
Learning Theory (IBLT; Gonzalez, Lerch, & Lebiere, 2003). Furthermore, we evaluate the
performance of the IBL model of the security analyst using two cyber-SA measures, accuracy and
timeliness (Jajodia et al., 2010), on a popular simple cyber-attack scenario of an island-hopping
attack (Ou, Boyer, & McQueen, 2006; Xie, Li, Ou, Liu, & Levy, 2010). IBLT is well suited to
modeling the security analyst’s decisions, as the theory provides a generic decision-making process
that starts by recognizing and generating experiences through interaction with a changing decision
environment, and closes with the reinforcement of experiences that led to good decision outcomes
through feedback from the decision environment. Unlike R-CAST, IBLT neither assumes a
rule-based cognitive process nor needs an existing knowledge base to choose future courses of action
and make decisions; rather, experiences in IBLT are generated over time as a result of the
interaction of an IBL model with its decision environment (e.g., a cyber-attack scenario).

In  the  next  section,  we  describe  a  popular 
cyber-attack  scenario  of  an  island-hopping  attack 
in  a  corporate  network.  Then,  we  describe  a  model 
based  upon  IBLT  that  is  used  to  make  predictions 
about  the  cyber-SA  of  a  security  analyst  in  the 
scenario.  Finally,  we  discuss  the  predictions  from 
the  IBL  model  and  explain  the  implication  of  the 
model’s  predictions  when  designing  training  and 
decision  support  tools  for  security  analysts. 


A  SIMPLE  SCENARIO  OF 
A  CYBER  ATTACK 

The cyber-infrastructure in a corporate network typically consists of a Webserver and a fileserver
(Ou et al., 2006; Xie et al., 2010) that are protected by two firewalls in the Demilitarized zone (or
DMZ), where the DMZ separates the external network (“Internet”) from the company’s internal
LAN network. The Webserver handles customer interactions on a company’s webpage. The
fileserver is a repository for many workstations that are internal to the company and that allow
company employees to do their daily operations. These operations are made possible by enabling
workstations to mount executable binaries from the fileserver. An external firewall (‘firewall 1’ in
Figure 1) controls the traffic between the Internet and the DMZ. Firewall 1’s rules are configured
to allow a bidirectional flow of the incoming “request” traffic and the outgoing “response” traffic
between the Internet and the company’s Webserver. Generally, an attacker is identified as a computer
on the Internet, and thus firewall 1 protects the path between the attacker’s computer on the Internet
and the company’s website hosted by the Webserver. Another firewall (‘firewall 2’ in Figure 1)
controls the flow of traffic between the Webserver and the fileserver (i.e., the company’s internal LAN
network). Firewall 2 allows Network File System (NFS) protocol access between the fileserver and
the Webserver. For this cyber-infrastructure, most attackers follow the sequence of an
“island-hopping” attack (Jajodia et al., 2010, p. 30), where the Webserver is compromised first, and
then the Webserver is used to originate attacks on the fileserver (through a vulnerability in the NFS
protocol) and other company workstations (by mounting executable binaries from the fileserver).

Figure 1. A simple scenario of a cyber-attack. The attacker, using a computer on the Internet, tries to
gain access to a company’s fileserver indirectly through the company’s Webserver. Source: Xie et
al. (2010).

The security analyst is in charge of protecting the cyber-infrastructure of the company (consisting
of the two firewalls, DMZ, Webserver, fileserver, and workstations) from cyber-attacks
originating from computers on the Internet. Ou et al. (2006) and Xie et al. (2010) defined a
simple scenario of an island-hopping cyber-attack within this cyber-infrastructure. In the simple
scenario, a security analyst is exposed to a sequence of 25 network events (consisting of both
threat and non-threat events), whose nature (threat or non-threat) is not precisely known to the
security analyst. Out of the total of 25 events, there are 8 predefined threat events in the sequence that
are initiated by an attacker. The attacker, through some of these 8 events, first compromises the
Webserver by remotely exploiting a vulnerability on the Webserver and getting local access to the
Webserver. If the cyber-attack remains undetected by the 8th event, then the attacker gains full
access to the Webserver. Since, typically in a corporate network and in the simple scenario, a
Webserver is allowed to access the fileserver only through an NFS event, the attacker then
modifies data on the fileserver through the vulnerability in the NFS event. If the cyber-attack remains
undetected by the security analyst by the 11th event, then the attacker gains full access to the
fileserver. Once the attacker gets access to modify files on the fileserver, he then installs a
Trojan-horse program (i.e., malicious code) in the executable binaries on the fileserver, which is then
downloaded and used by different workstations (event 19 out of 25). The attacker can now wait
for an innocent user on a workstation to execute the Trojan-horse program and obtain control of
the machine (event 21 out of 25).

During the course of this simple scenario, a security analyst is able to observe all 25 events
corresponding to file executions and the packets of information transmitted on and between the
Webserver, fileserver, and different workstations. He is also able to observe alerts that correspond
to some network events using an intrusion-detection system (IDS) (Jajodia et al., 2010). The
IDS raises an alert for suspicious file executions or suspicious packet transmission events that are
generated on the corporate network. Among the alerts generated by the IDS here, there is both a
false-positive and a false-negative alert, and one alert that corresponds to the 8th event but is received
by the analyst after the 13th event in the sequence (i.e., a time-delayed alert). Most importantly, due
to the absence of a precise alert corresponding to a potential threat event, the analyst does not have
precise information on whether a network event and its corresponding alert (from the IDS) are
initiated by an attacker or by an innocent company employee. Even though the analyst lacks this
precise information, he needs to decide, as early and as accurately as possible, whether the
sequence of events in the simple scenario constitutes a cyber-attack. Timeliness is expressed as a
proportion: the percentage of the scenario’s 25 events that the analyst has seen before he makes a
decision, subtracted from 100%. The accuracy of the analyst is determined by whether the analyst’s
decision was to ignore the sequence of events or to declare a cyber-attack based upon the sequence
of observed network events.
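
The timeliness measure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `timeliness` and the constant `TOTAL_EVENTS` are hypothetical, and the sketch assumes the decision is made after the analyst has observed `events_seen` of the 25 events.

```python
TOTAL_EVENTS = 25  # number of network events in the simple scenario


def timeliness(events_seen: int, total_events: int = TOTAL_EVENTS) -> float:
    """Return timeliness as a percentage.

    Timeliness is 100% minus the percentage of the scenario's events
    that were observed before the decision was made.
    """
    if not 0 <= events_seen <= total_events:
        raise ValueError("events_seen must lie between 0 and total_events")
    return 100.0 - 100.0 * events_seen / total_events
```

Under this reading, a decision made after 10 of the 25 events yields a timeliness of 60%, and a decision made only after all 25 events yields 0%.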
BACKGROUND

We believe that a security analyst’s accurate and timely classification of a sequence of network
events as a cyber-attack or not (i.e., the analyst’s cyber-SA) is based upon the following three factors:

1. The knowledge level of the analyst in terms of the mix of threat and non-threat experiences
stored in the analyst’s memory.

2. The analyst’s risk-tolerance level, i.e., the willingness of an analyst to classify a sequence of
events as a cyber-attack.

3. The analyst’s similarity model, i.e., the process that the analyst uses to compare
network events with prior experiences that are stored in his memory.

Prior literature has shown that the cyber-SA of a security analyst is a function of a priori
experiences in an analyst’s memory about a cyber-attack scenario (Jajodia et al., 2010) and the
analyst’s risk-tolerance (McCumber, 2004; Salter, Saydjari, Schneier, & Wallner, 1998). Similarly,
Dutt, Ahn, and Gonzalez (2011) and Dutt and Gonzalez (2011) have provided a priori predictions
about the cyber-SA of a simulated analyst in an IBL model and demonstrated that these predictions
are influenced by the experiences in the memory of a simulated analyst and the risk-tolerance of the
simulated analyst.

Recent research in judgment and decision making (JDM) has also discussed how our experiences
of events in the environment shape our decision choices (Hertwig, Barron, Weber, & Erev, 2004;
Lejarraga, Dutt, & Gonzalez, 2011). Typically, having a greater number of bad experiences in
memory about an activity (e.g., a cyber-attack) makes a decision-maker (e.g., an analyst) avoid the
activity, whereas good experiences with an activity boost the likelihood that a decision-maker will
undertake the same activity (Hertwig et al., 2004; Lejarraga et al., in press).

Similarly, past research has found the role of similarity to be critical in problem solving,
judgment, decision making, categorization, and cognition (Goldstone, Day, & Son, 2010; Vosniadou
& Ortony, 1989). Essentially, two competing models of human similarity judgments
have been proposed: the geometric model (Shepard, 1962a, 1962b) and
the feature-based model (Tversky, 1977). In the geometric model, similarity between a pair of
objects (e.g., a situation event in the decision environment and an experience in memory) is taken to
be inversely related to the distance between the two objects’ points in the space. The distance could be
either a linear difference (linear-geometric) or a squared difference (squared-geometric) between
the two objects’ points in the space (Shepard, 1962a, 1962b). In contrast, the feature-based
similarity model characterizes similarity in terms of a feature-matching process based on weighting
common and distinctive features between a pair of objects (Tversky, 1977).

Although there is literature that discusses the role of prior experiences of threats in general
and the relevance of risk-tolerance in network security (Jajodia et al., 2010; McCumber, 2004;
Salter et al., 1998), it is difficult to find research that empirically investigates the role of both these
factors together on a security analyst’s cyber-SA. Similarly, although there is research that applies
both models of similarity to human judgments in general (Goldstone, Day, & Son, 2010), research
is needed that evaluates the effects of similarity models on the cyber-SA of a security analyst in
cyber-attack scenarios.

The above three factors, as well as many other cognitive factors that may limit or enhance the
cyber-SA of an analyst, can be studied through computational cognitive modeling. In this chapter,
we use IBLT to develop a model of the security analyst, and we assess the effects of the three
factors (the analyst’s knowledge level, risk-tolerance, and similarity model) on the accuracy and
timeliness with which the analyst detects a cyber-attack in the simple scenario.

INSTANCE-BASED  LEARNING 
THEORY  AND  IBL  MODEL  OF 
THE  SECURITY  ANALYST 

IBLT is a theory of how people make decisions from experience in dynamic environments
(Gonzalez et al., 2003). In the past, computational models based on IBLT have proven able to
generate a priori predictions of human behavior in many dynamic decision-making situations,
including those faced by the security analyst (Dutt, Ahn, & Gonzalez, 2011; Dutt, Cassenti, &
Gonzalez, 2010; Dutt & Gonzalez, 2011; Gonzalez & Dutt, 2010).

IBLT proposes that people represent every decision-making situation as instances that are stored
in memory. For each decision-making situation, an instance is retrieved from memory and reused
depending on the similarity of the current situation’s attributes to the attributes of instances stored
in memory. An instance in IBLT is composed of three parts: situation (S) (the knowledge of
situation attributes in a situation event), decision (D) (the course of action to take for a situation
event), and utility (U) (a measure of the goodness of a decision made for a situation event).

In the case of the decision situations faced by the security analyst, these attributes are those that
characterize potential threat events in a corporate network and that need to be investigated
continuously by the analyst. The situation attributes that characterize potential threat events in the
simple scenario are the IP address of the location (Webserver, fileserver, or workstation) where the
event took place, the directory location in which the event took place, whether the IDS raised an
alert corresponding to the event, and whether the operation carried out as part of the event (e.g., a
file execution) by a user of the network succeeded or failed. However, as there are inherent
uncertainties present in any scenario, one could think of other attributes that might characterize the
simple scenario. Thus, we admit that the list of these four attributes might not be exhaustive and is
open to the inclusion of other attributes or a different set of attributes. However, for the purpose of
analysis in this chapter, we assume that the four attributes described above characterize the simple
scenario.
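
As a minimal sketch of the instance structure (situation, decision, utility) and the four situation attributes described above, one could write the following in Python. The class and field names, and the use of strings and booleans as attribute values, are assumptions made for illustration; the chapter itself does not prescribe a data layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Situation:
    """The four situation attributes assumed for the simple scenario."""
    ip: str                    # location where the event took place
    directory: str             # directory location of the event
    alert: bool                # whether the IDS raised an alert
    operation_succeeded: bool  # whether the operation succeeded or failed


@dataclass(frozen=True)
class Instance:
    """One IBL instance with its S, D, and U parts."""
    situation: Situation  # S slots
    decision: str         # D slot, e.g. "threat" or "non-threat"
    utility: float        # U slot, goodness of the decision


# A hypothetical pre-stored experience; the values are made up.
example = Instance(
    situation=Situation("webserver", "/tmp", True, True),
    decision="threat",
    utility=1.0,
)
```

Freezing the dataclasses keeps stored experiences immutable, which matches the idea that an instance is a record of a past situation rather than a mutable state.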

In the IBL model of the security analyst, an instance’s S slots refer to the situation attributes
defined above; the D slot refers to the decision, i.e., whether to classify a sequence of events as
constituting a cyber-attack or not; and the U slot refers to the accuracy of the classification of a
situation as a threat. IBLT proposes five mental phases in a closed-loop decision-making process:
recognition, judgment, choice, execution, and feedback (Figure 2). The five decision phases
represent a complete learning cycle where the theory explains how knowledge is acquired, reused,
and learnt by human decision-makers. Because the focus of this study is on the recognition and
comprehension processes in the SA of a security analyst, we will only discuss the recognition,
judgment, choice, and execution phases in IBLT (for details on the feedback phase, refer to
Gonzalez and Dutt (2010) and Gonzalez, Lerch, and Lebiere (2003)). In addition to its
decision-making process, IBLT borrowed some of the proposed statistical-learning mechanisms
from a popular cognitive architecture called ACT-R (Anderson & Lebiere, 1998, 2003). Thus, most
of the previous cognitive models that have used IBLT were developed for the ACT-R architecture.

Figure 2. The five phases of IBL theory (right) and an environment, i.e., a decision task with
which a model developed according to the IBLT interacts (left).

The IBLT’s process starts in the recognition phase, which searches for alternatives and classifies
the current situation as typical or atypical. The current situation is typical if there are memories of
similar situations (i.e., instances of previous trials that are similar enough to the current situation).
If the situation is typical, then the most similar instance is retrieved from memory in the judgment
phase and is used to determine the expected utility of the situation being evaluated. In the IBL
model, the decision alternatives refer to whether a sequence of events constitutes a cyber-attack or
not. The actual determination of the utility is based upon the value in the utility slot of an instance
retrieved from memory. The decision to retrieve an instance from memory for a situation event is
based upon a comparison of the instances’ memory strength, called activation. Thus, an instance
is retrieved from memory if it has the highest activation among all instances in memory.

If the situation event in the network is atypical, then a judgment heuristic rule is applied to
determine the utility of a new instance corresponding to a decision alternative. In the IBL model,
we pre-populate the memory of a simulated analyst with certain instances to start with. These are
assumed to be pre-stored experiences of past situations in the analyst’s memory, and thus all
situation events are treated by the model as typical.


Next, in the choice phase, a decision alternative is selected based upon the utility determined in the
judgment phase (above). Thus, the choice phase in the IBL model consists of whether to classify
a set of network events seen up to the scenario’s current event as constituting a cyber-attack, or
whether to accumulate more evidence by further observing incoming situation events before such a
classification could be made. According to IBLT, this decision is determined by the “necessity level,”
which represents a satisficing mechanism to stop search of the environment and be “satisfied” with
the current evidence (e.g., the satisficing strategy, Simon & March, 1958). We will call this parameter
in the model the “risk-tolerance level” (a free parameter); it represents the number of events the
model has to classify as threats before the model classifies the scenario as a cyber-attack. For the
risk-tolerance level, each time the model classifies a situation event in the network as a threat (based
upon retrieval of an instance from memory), a counter increments and signifies an accumulation
of evidence in favor of a cyber-attack. If the value of the accumulated evidence (represented
by the counter) becomes equal to the analyst’s risk-tolerance level, the analyst will classify the
scenario as a cyber-attack based upon the sequence of already observed network events; otherwise,
the model will decide to continue obtaining more information from the environment and observe the
next situation event in the network. We manipulate the risk-tolerance parameter in this study at
different numbers of events: 2, 4, or 6 (more details ahead). Regardless, the main outcome of the
choice phase in the model is whether to classify a set of network events as a cyber-attack or not.
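
The evidence-accumulation rule described above can be sketched as follows. This is a simplified illustration, not the model's implementation: the per-event threat classifications are taken as given (in the model they come from instance retrieval), inertia is ignored, and the function name `first_attack_decision` is hypothetical.

```python
from typing import Iterable, Optional


def first_attack_decision(
    event_is_threat: Iterable[bool], risk_tolerance: int
) -> Optional[int]:
    """Return the 1-based event number at which the scenario is classified
    as a cyber-attack, or None if the counter never reaches the threshold."""
    if risk_tolerance < 1:
        raise ValueError("risk_tolerance must be a positive integer")
    threat_count = 0  # accumulated evidence in favor of a cyber-attack
    for event_number, is_threat in enumerate(event_is_threat, start=1):
        if is_threat:
            threat_count += 1
        if threat_count == risk_tolerance:
            return event_number
    return None
```

For a made-up classification sequence in which events 2, 4, and 5 are judged to be threats, a risk-tolerance level of 2 leads to a cyber-attack decision at event 4, while a level of 4 leads to no decision at all.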

The model’s choice phase is also based upon a property of the analyst to exhibit “inertia,” i.e.,
simply not deciding to classify a sequence of observed network events as a cyber-attack due
to lack of attention, and continuing to wait for the next situation event. The inertia in the model is
governed by a free parameter called probability of inertia (Pinertia) (Gonzalez & Dutt, 2010;
Gonzalez, Dutt, & Lejarraga, 2011). If the value of a random number drawn from a uniform
distribution on [0, 1] is less than Pinertia, the model will choose to observe another network
event in the scenario and will not classify the sequence of already observed events as a
cyber-attack; otherwise, the model will make a decision to classify the observed events based upon
the set risk-tolerance level. We assumed a default value of Pinertia at 0.3 (or 30%).
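
A minimal sketch of the inertia check, assuming Python's standard `random` module as the source of the uniform draw; the function name `waits_due_to_inertia` and the constant `P_INERTIA` are hypothetical names chosen for illustration.

```python
import random

P_INERTIA = 0.3  # default probability of inertia assumed in the chapter


def waits_due_to_inertia(rng: random.Random, p_inertia: float = P_INERTIA) -> bool:
    """Return True when the uniform draw is below p_inertia, meaning the
    model observes the next event instead of making a classification."""
    return rng.random() < p_inertia
```

Passing an explicit `random.Random` instance lets a simulation be seeded and repeated; with the default value, the model would skip the classification decision on roughly 30% of events.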

The  choice  phase  is  followed  by  the  execution 
of  the  best  decision  alternative.  The  execution 
phase  for  the  IBL  model  means  either  to  classify 
a  sequence  of  observed  events  as  a  cyber-attack 
and  stop  online  operations  in  the  company,  or  not 
to  classify  the  sequence  of  events  as  a  cyber-attack 
and  to  let  the  online  operations  of  the  company 
continue  undisrupted. 

In  IBLT,  the  activation  of  an  instance  i  in 
memory  is  defined  using  the  ACT-R  architecture’s 
activation  equation: 

$$A_i = B_i + \sum_{l=1}^{k} P_l \times M_{li} + \epsilon_i \qquad (1)$$


where $i$ refers to the $i$th instance that is pre-populated in memory, with $i = 1, 2, \ldots,$ total
number of pre-populated instances; and $B_i$ is the base-level learning parameter, which reflects the
recency and frequency of the use of the $i$th instance since the time it was created. It is given by:

$$B_i = \ln\left( \sum_{t_i \in \{1, \ldots, t-1\}} (t - t_i)^{-d} \right) \qquad (2)$$


The frequency effect is provided by $t - 1$, the number of retrievals of the $i$th instance from
memory in the past. The recency effect is provided by $t - t_i$, the event since the $t_i$th past
retrieval of the $i$th instance (in Equation 2, $t$ denotes the current event number in the scenario).
The $d$ is the decay parameter and has a default value of 0.5 in the ACT-R architecture, and it is the
value we assume for the IBL model of the security analyst.
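
To make Equation 2 concrete, the base-level term can be computed as in the sketch below. The function name `base_level` and the representation of past retrievals as a list of event numbers are assumptions made for illustration.

```python
import math
from typing import Sequence


def base_level(
    current_event: int, past_retrievals: Sequence[int], decay: float = 0.5
) -> float:
    """Compute B_i = ln(sum over past retrievals t_i of (t - t_i)^(-d)).

    `past_retrievals` holds the event numbers t_i at which the instance
    was retrieved; each must be earlier than `current_event` (t).
    """
    if not past_retrievals:
        raise ValueError("at least one past retrieval is required")
    if any(t_i >= current_event for t_i in past_retrievals):
        raise ValueError("past retrievals must precede the current event")
    total = sum((current_event - t_i) ** -decay for t_i in past_retrievals)
    return math.log(total)
```

With the default decay of 0.5, an instance retrieved one event ago has a base level of ln(1) = 0, whereas one retrieved four events ago has ln(0.5), so recent use raises activation. Additional past retrievals add terms to the sum and raise it further, which is the frequency effect.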

The $\sum_{l=1}^{k} P_l \times M_{li}$ term is the similarity component and represents the mismatch
between a situation event’s attributes and the situation (S) slots of an instance $i$ in memory. Here,
$k$ is the total number of a situation’s attributes that are used to retrieve the instance $i$ from
memory. In the IBL model, $k = 4$, as in the simple scenario there are 4 attributes that characterize a
situation event in the network and that are also used to retrieve instances from memory. As
mentioned above, these attributes are the IP, directory, alert, and operation in an event. The match
scale ($P_l$) reflects the amount of weighting given to the similarity between an instance $i$’s
situation slot $l$ and the corresponding situation event’s attribute. $P_l$ is generally a negative
integer with a common value of -1.0 for all $k$ situation slots of an instance $i$. The $M_{li}$, or
match similarity, represents the similarity between the value $l$ of a situation event’s attribute that is
used to retrieve instances from memory and the value in the corresponding situation slot of an
instance $i$ in memory. In this chapter, the $\sum_{l=1}^{k} P_l \times M_{li}$ term has been defined
by using both a squared-geometric similarity model and a feature-based similarity model (Shepard,
1962a, 1962b; Tversky, 1977). In the squared-geometric model, the
$\sum_{l=1}^{k} P_l \times M_{li}$ term is defined as:

EfJxM1.*E-lx('.-,~>!  <3) 

1=1  1=1 
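
To make the squared-geometric term concrete, here is a minimal Python sketch that evaluates it over the four situation attributes using the numeric codes of Table 1. The dictionary layout, the attribute key names, and the function name are assumptions made for illustration only.

```python
# Hypothetical key names for the four situation attributes named in the text.
ATTRIBUTES = ("ip", "directory", "alert", "operation")


def squared_geometric_term(instance, event, match_scale=-1.0):
    """Sum of match_scale * squared difference over the four attributes."""
    return sum(
        match_scale * (instance[name] - event[name]) ** 2
        for name in ATTRIBUTES
    )


# Event: Webserver (1), File X (1), alert present (1), operation successful (1).
event = {"ip": 1, "directory": 1, "alert": 1, "operation": 1}

identical = dict(event)
# Differs in IP (Fileserver = 2) and alert (absent = 0).
nearby = {"ip": 2, "directory": 1, "alert": 0, "operation": 1}
# Differs only in directory, which carries the missing-value code -100.
missing_dir = {"ip": 1, "directory": -100, "alert": 1, "operation": 1}

scores = {
    "identical": squared_geometric_term(identical, event),
    "nearby": squared_geometric_term(nearby, event),
    "missing_dir": squared_geometric_term(missing_dir, event),
}
```

Because the differences are squared, the -100 code for a missing directory dominates the term, so an instance that mismatches only on that slot is penalized far more than one that mismatches on IP and alert.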


However, in the feature-based similarity model, the
$\sum_{l=1}^{k} P_l \times M_{li}$ term is defined as:

$$\sum_{l=1}^{k} P_l \times M_{li} = \theta \times f(i \cap \text{event}) - \alpha \times f(i - \text{event}) - \beta \times f(\text{event} - i) \tag{4}$$

The similarity of instance $i$ to a situation event is expressed as
a linear combination of the measures of their common and
distinctive features. The term $f(i \cap \text{event})$ represents
the number of features that the four slots of instance $i$ and the
four attributes in a situation event have in common. The term
$f(i - \text{event})$ represents the features in the instance $i$'s
four slots that are missing from the four attributes in the
situation event. The term $f(\text{event} - i)$ represents the
features of the four attributes in the situation event that are
missing from the instance $i$'s four slots. Furthermore, $\theta$,
$\alpha$, and $\beta$ are weights for the common and distinctive
components. We assumed default values of the weights: $\theta = 2$,
$\alpha = 1$, and $\beta = 1$. We chose these defaults because they
balance the effect of the common features (first term in Equation
4) against the effects of the uncommon features (second and third
terms in Equation 4). The default assumption is therefore a safe
one, both because it follows the literature (Tversky, 1977) and
because we are making predictions about how an analyst works
without knowing an analyst's real behavior.
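
The feature-based term can be sketched in Python as follows. This sketch assumes that a "feature" is an (attribute, coded value) pair and that $f$ simply counts features; the chapter does not spell out either choice, and the function and key names are hypothetical.

```python
def feature_based_term(instance, event, theta=2.0, alpha=1.0, beta=1.0):
    """Equation 4 with f taken as a simple count of features.

    Assumption of this sketch: each (attribute, coded value) pair is one
    feature, so two records share a feature when they agree on an attribute.
    """
    instance_features = set(instance.items())
    event_features = set(event.items())
    common = len(instance_features & event_features)
    only_instance = len(instance_features - event_features)
    only_event = len(event_features - instance_features)
    return theta * common - alpha * only_instance - beta * only_event


event = {"ip": 1, "directory": 1, "alert": 1, "operation": 1}
identical = dict(event)
# Agrees on directory and operation, differs on IP and alert.
half_match = {"ip": 2, "directory": 1, "alert": 0, "operation": 1}
# Differs on all four attributes.
no_match = {"ip": 3, "directory": -100, "alert": 0, "operation": 0}

scores = {
    "identical": feature_based_term(identical, event),
    "half_match": feature_based_term(half_match, event),
    "no_match": feature_based_term(no_match, event),
}
```

Under these assumptions and the default weights, a half-matching instance scores exactly zero: the two shared features contribute 2 x 2 and the four distinctive features subtract 4, which is the balancing of common and uncommon features that the default weights are meant to provide.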

In order to find the value of the $\sum_{l=1}^{k} P_l \times M_{li}$
term, the situation events' attributes and the values in the
corresponding slots of instances in memory were coded using numeric
codes. Table 1 shows the codes assigned to the SDU slots of
instances in memory and the situation events' attributes in the
simple scenario. These codes were chosen so that the similarity
term makes a nontrivial contribution to the activation equation
(Equation 1).

Due to the $\sum_{l=1}^{k} P_l \times M_{li}$ specification,
instances that encode a situation similar to the current situation
event's attributes receive a less negative activation (in Equation
1). In contrast, instances that encode a situation dissimilar to
the current situation event's attributes receive a more negative
activation.

Furthermore, $\varepsilon_i$ is the noise value that is computed
and added to an instance $i$'s activation at the time of its
retrieval attempt from memory. The noise value is characterized by
a parameter $s$. The noise is defined as

$$\varepsilon_i = s \times \ln\left( \frac{1 - \eta_i}{\eta_i} \right) \tag{5}$$

where $\eta_i$ is a random draw from a uniform distribution bounded
in [0, 1] for an instance $i$ in memory. We set the parameter $s$
in an IBL model to make noise a part of the activation equation
(Equation 1). The $s$ parameter has a default value of 0.25 in the
ACT-R architecture, and we assume this default value of $s$ in the
IBL model of the security analyst.
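
A small Python sketch of the noise term in Equation 5 follows. The function names are hypothetical, and redrawing when the uniform draw is exactly 0 is an assumption of this sketch (the logarithm is undefined at the endpoints of the interval).

```python
import math
import random


def noise_from_draw(eta, s=0.25):
    """Equation 5 for a given uniform draw eta, strictly between 0 and 1."""
    if not 0.0 < eta < 1.0:
        raise ValueError("eta must lie strictly between 0 and 1")
    return s * math.log((1.0 - eta) / eta)


def sample_noise(rng, s=0.25):
    """Draw eta from rng and return the corresponding noise value."""
    eta = rng.random()  # uniform in [0.0, 1.0)
    while eta == 0.0:   # assumption: redraw the endpoint, where the log is undefined
        eta = rng.random()
    return noise_from_draw(eta, s)


rng = random.Random(0)  # fixed seed only so that the sketch is repeatable
draws = [sample_noise(rng) for _ in range(5)]
```

The noise is symmetric around zero: a draw of 0.5 gives no noise, and draws of 0.25 and 0.75 give values of equal size and opposite sign, with $s$ scaling the spread.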


IMPLEMENTATION  AND 
EXECUTION  OF  THE  IBL  MODEL 

The IBL model of the security analyst was created using Matlab
software. The IBL model goes over a sequence of 25 network events
in the simple scenario (Figure 1). The memory of a simulated
analyst in the model was pre-populated with instances encoding all
possible sequences of network events based upon values of events'
attributes. Some of these instances contained a threat value as the
utility and some did not (more information below). Unbeknownst to
the model (but known to the modeler), out of the 25 events in the
scenario, there are 8 predefined threat events that are executed by
an attacker outside the company (Ou et al., 2006; Xie et al.,
2010). For each event in the scenario, the IBL model uses Equations
1, 2, 5, and either 3 or 4 to retrieve an instance that is most
similar to the encountered event. Based upon the value of the
utility slot of a retrieved instance, the situation event is
classified as a threat or not a threat. Depending upon the inertia
mechanism and the risk-tolerance level of a simulated analyst in
the model, a decision is made either to classify a sequence of
observed events as a cyber-attack and stop the company's online
operations, or to let the company continue its online operations
(no cyber-attack).

Table 1. The coded values in the slots of an instance in memory and
attributes of a situation event

Attributes       Values             Codes
IP (S)           Webserver          1
                 Fileserver         2
                 Workstation        3
Directory (S)    Missing value      -100
                 File X             1
Alert (S)        Present            1
                 Absent             0
Operation (S)    Successful         1
                 Unsuccessful       0
Decision (D)     Cyber-attack       1
                 No Cyber-attack    0
Threat (U)       Yes                1
                 No                 0

The IBL model was executed for a set of 500 simulated analysts on
the same simple scenario, where each simulated analyst encountered
25 or fewer situation events in the network. For each of the 500
simulated analysts, we manipulated the mix of threat and non-threat
instances in memory (i.e., the experience of the analyst), the
risk-tolerance level of the analyst, and the similarity model used
by the analyst.

The mix of threat and non-threat instances in the model's memory
could be one of the following three kinds: an ambivalent analyst
(Ambi), with 50% threat instances and 50% non-threat instances for
each situation event in the scenario; an extra-careful analyst
(Extra), with 75% threat instances and 25% non-threat instances for
each situation event in the scenario; and a less-careful analyst
(Less), with 25% threat instances and 75% non-threat instances for
each situation event in the scenario. The risk-tolerance level of
the analyst was manipulated at the following three levels: low (2
events out of a possible 25 events need to be classified as threats
before the analyst classifies a sequence of observed events as a
cyber-attack); medium (4 events out of a possible 25 events need to
be classified as threats before the analyst classifies a sequence
of observed events as a cyber-attack); and high (6 events out of a
possible 25 events need to be classified as threats before the
analyst classifies a sequence of observed events as a
cyber-attack). Note that the values of 2, 4, and 6 events for the
risk-tolerance are a reasonable and balanced manipulation, given
that there are only 8 total threat events (whose threat identity is
unknown to the model) in the scenario. Finally, the similarity
model was manipulated at two levels and could be either
squared-geometric (Equation 3) or feature-based (Equation 4).
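
The risk-tolerance thresholds can be illustrated with a short Python sketch of the stopping rule. It deliberately leaves out the inertia mechanism and the memory retrieval step, so it is a simplification and not the chapter's Matlab model; the function name, the dictionary of levels, and the example sequence of classifications are all made up for illustration.

```python
# Number of events that must be classified as threats before the simulated
# analyst declares a cyber-attack (values from the text).
RISK_TOLERANCE = {"low": 2, "medium": 4, "high": 6}


def stopping_event(classified_as_threat, risk_tolerance):
    """Return the 1-based event number at which the analyst stops.

    classified_as_threat: per-event booleans, True when the retrieved
        instance led the model to classify that event as a threat.
    Returns None when the threshold is never reached.
    Simplification: the inertia mechanism is ignored here.
    """
    threshold = RISK_TOLERANCE[risk_tolerance]
    threats_seen = 0
    for event_number, is_threat in enumerate(classified_as_threat, start=1):
        if is_threat:
            threats_seen += 1
        if threats_seen >= threshold:
            return event_number
    return None


# Hypothetical classifications for the first nine events of a run.
flags = [False, True, False, True, True, False, True, True, True]
stops = {level: stopping_event(flags, level) for level in RISK_TOLERANCE}
```

For this made-up sequence, the low, medium, and high risk-tolerance analysts would stop at events 4, 7, and 9 respectively, which shows how a higher tolerance delays the stop decision.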

We wanted to derive predictions of the effect of the above
manipulations in the model upon the cyber-SA of the analyst. The
cyber-SA of a simulated analyst was measured using the accuracy and
timeliness of the analyst. Accuracy was evaluated using two
different cyber-SA metrics, recall and precision, and timeliness
was evaluated in the model using a single timeliness cyber-SA
metric (Jajodia et al., 2010). Recall is the percentage of events
correctly detected as threats out of the total number of known
threat events observed by the model before the model stopped in the
scenario (recall is the same as hit rate in Signal Detection
Theory; Jajodia et al., 2010). Precision is the percentage of
events correctly detected as threats out of the total number of
events detected as threats by the model before it stopped in the
scenario. Timeliness is 100% minus the percentage of events, out of
a total of 25, after which the model stops in the scenario and
classifies the scenario to be a cyber-attack (timeliness could be
defined as the number of events out of 25, but defining it as a
percentage allows us to compare it to the other two cyber-SA
measures). A point to note is that if the model is unable to stop
before the 25 events elapse in the scenario, then the denominators
of the above cyber-SA metrics equal 25.
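
The three metrics can be computed as in the following Python sketch for a run that stops at a given event. The function name, argument names, and the example data are hypothetical; returning 0.0 when a denominator is zero is an assumption of this sketch, and the special case of a run that never stops is not modeled here.

```python
def cyber_sa_metrics(is_threat, flagged, stop_at, total_events=25):
    """Recall, precision, and timeliness (all in percent) for one run.

    is_threat: ground-truth booleans per event (known to the modeler).
    flagged: booleans per event, True when the model classified a threat.
    stop_at: number of events observed before the model stopped.
    """
    truth = is_threat[:stop_at]
    flags = flagged[:stop_at]
    hits = sum(1 for t, f in zip(truth, flags) if t and f)
    known_threats = sum(truth)
    detected = sum(flags)
    # Assumption: report 0.0 instead of dividing by zero.
    recall = 100.0 * hits / known_threats if known_threats else 0.0
    precision = 100.0 * hits / detected if detected else 0.0
    timeliness = 100.0 - 100.0 * stop_at / total_events
    return {"recall": recall, "precision": precision, "timeliness": timeliness}


# Hypothetical run that stops after five events.
truth = [False, True, True, False, True]
flags = [True, True, False, True, True]
metrics = cyber_sa_metrics(truth, flags, stop_at=5)
```

In this made-up run the model catches two of the three real threats it observed (recall of about 66.7%), but only two of its four threat classifications are correct (precision of 50%), and stopping after 5 of 25 events gives a timeliness of 80%.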

For both similarity models, we expected the best performance for
the IBL model representing an extra-careful analyst with a low
risk-tolerance, and the worst performance for the IBL model
representing a less-careful analyst with a high risk-tolerance.
This is because an extra-careful analyst with a low risk-tolerance
will classify network events more cautiously than a less-careful
analyst with a high risk-tolerance. Also, as both similarity
models, squared-geometric and feature-based, aim to search for the
instance in memory that is most similar to a situation event in the
simple scenario, we expected similar performance from the IBL model
under both similarity models.

RESULTS 

Figure 3 shows the IBL model's predictions of the cyber-SA measures
(recall, precision, and timeliness) of an average security analyst
under manipulations of the memory, the risk-tolerance, and the
similarity model used. First, for both similarity models, the
effect of the memory manipulation on the cyber-SA measures (panels
A and D) was stronger than the effect of the risk-tolerance
manipulation (panels B and E). Thus, although there was a
pronounced change in the three measures (recall, precision, and
timeliness) as a result of the memory manipulation (Less, Ambi, and
Extra), the change due to the risk-tolerance manipulation (High,
Medium, and Low) was small. Furthermore, as we expected for both
similarity models, an extra-careful analyst with a low
risk-tolerance did better on all three performance measures than a
less-careful analyst with a high risk-tolerance (panels C and F).
Also, precision was higher in the feature-based model than in the
squared-geometric model, but in general, precision was lower than
recall and timeliness across the manipulations. This latter
observation arises because a model that has greater recall and
timeliness need not simultaneously have greater precision. That is,
a model that retrieves more threat instances from memory, and
retrieves them rapidly, does not necessarily retrieve them
accurately for each situation event in the scenario (thus, there
are chances of false alarms).

Figure 3. The effect of experience (memory) on cyber-SA of an
analyst in the squared-geometric similarity model (A) and in the
feature-based similarity model (D). The effect of risk-tolerance on
cyber-SA of an analyst in the squared-geometric similarity model
(B) and in the feature-based similarity model (E). The interaction
effect of memory and risk-tolerance on cyber-SA of an analyst in
the squared-geometric similarity model (C) and in the feature-based
similarity model (F). A greater percentage on all three cyber-SA
measures (recall, precision, and timeliness) is more desirable, as
it makes the simulated analyst more efficient.

DISCUSSION 

In this chapter, we have shown that computational models based on
the IBLT can be used to make predictions of a security analyst's
cyber-SA in a cyber-attack scenario. In particular, the model can
make concrete predictions of the level of recall, precision, and
timeliness of a security analyst given some level of the analyst's
experiences about network events (in memory), the analyst's
risk-tolerance, and the model that the analyst uses to compute the
similarity of network events with experiences in his memory.

We created an IBL model of the security analyst for a simple
scenario of a typical island-hopping cyber-attack. The
island-hopping attack portrayed in the simple scenario is one of
the most common methods of cyber-attack in the real world (Ou et
al., 2006; Xie et al., 2010). Then, using the simple scenario, we
evaluated the performance of a simulated analyst on three commonly
used measures of cyber-SA. These measures are based upon the
accuracy of the analyst (precision and recall) and the speed with
which the analyst reacts to cyber-attacks (timeliness). Our results
revealed that both the risk-tolerance level of an analyst and the
mix of experiences of threat and non-threat instances in the
analyst's memory affect the analyst's cyber-SA, with the effect of
the analyst's experiences (in memory) being stronger than that of
risk-tolerance. One reason for the lesser impact of the
risk-tolerance manipulation could be the nature and working of IBL
models, which depend strongly upon retrieval of instances from
memory to make choice decisions. Another reason could be the
presence of inertia in the model, which drives the model to observe
more network events before it can make a stop decision;
risk-tolerance only comes into play in the model if the probability
of inertia (set at 30%) is exceeded.

Also, when the simulated analyst is less careful, then for any
situation event the model has only a 25% chance of retrieving
threat instances from memory and a 75% chance of retrieving
non-threat instances from memory. As a consequence, the model has a
lesser chance of classifying actual threat events in the simple
scenario as threats. Furthermore, it takes more time for the model
to accumulate evidence equal to the risk-tolerance level, which is
what causes the model to decide in favor of a cyber-attack and stop
work (decreasing the timeliness). However, when the simulated
analyst is more careful, then for any situation event there is a
75% chance of the model retrieving threat instances and a 25%
chance of it retrieving non-threat instances. As a consequence, the
model has a greater chance of classifying actual threats in the
simple scenario as threats and also takes less time to accumulate
evidence equal to the risk-tolerance level (increasing the
timeliness).

The most important aspect of the model is that, although recall and
timeliness increase as a direct function of the model's ability to
retrieve threat instances from memory and of its risk-tolerance,
there is no substantial increase in its precision when either of
the two manipulations (memory and risk-tolerance) is favorable
(Figure 3). The slow increase in precision is expected because a
model that is able to retrieve more threat instances from memory
and is less risk-tolerant might not necessarily be more precise in
its actions. However, there is still an increase in precision when
both memory and risk-tolerance are manipulated, and this suggests
that making a security analyst less risk-tolerant as well as
extra-careful might help increase his job efficiency. Because the
IBL model is a process model that observes events and makes
decisions by retrieving experiences from memory, these are only
some of the many predictions that the IBL model can make regarding
the cyber-SA of human analysts.

Furthermore, although the current model is able to make a priori
predictions, these need to be validated with human data. We plan to
run laboratory studies in the near future to assess human behavior
in this simple scenario. An experimental approach will allow us to
validate our model's predictions and improve the relevance of the
model and of the assumptions made about its free parameters. In
these experimental studies, we believe that some of the interesting
factors to manipulate would include the experiences of the human
analyst (stored in memory). One method we are currently considering
is to have participants read or watch examples of more and less
threatening scenarios before they take part in detecting
cyber-attacks in the simple scenario (analogous to priming the
memory of the model with more or fewer threat instances, as we did
in the IBL model). Also, we plan to record the risk-seeking and
risk-averse behavior of participants using popular measures
involving gambles to control for the risk-tolerance factor
(typically, a risk-seeking person is more risk-tolerant than a
risk-averse person). Also, once we calibrate the current
predictions of the model with data from an empirical study (which
we plan to collect in the future), we can evaluate the efficacy of
different similarity models. Thus, our next goal will be to
validate the predictions from the IBL model.

IMPLICATIONS  FOR  TRAINING 
AND  DECISION  SUPPORT 
OF  SECURITY  ANALYSTS 

If our model is able to represent the cyber-SA of human analysts
accurately, this model would have significant potential to
contribute towards the design of training and decision support
tools for security analysts. Based upon our current predictions, it
might be better to devise analyst training and decision support
that primes analysts to have experienced more threat than
non-threat network events. The analyst's cyber-SA is also impacted
by how tolerant he/she is to cyber-attacks. Thus, companies
recruiting security analysts for network monitoring operations
could measure the risk-seeking/risk-aversion character of a
potential analyst (by using different risk-orientation measures
that use gambles). Doing so would help evaluate a person's fit for
the security analyst's position. Furthermore, although
risk-orientation is a characteristic of a person (like his
personality) that comes about as a result of his day-to-day
experience and education, there might be training interventions
that could make analysts conscious of their risk-orientation or
alter their risk-orientation. Based upon our results, making
analysts less risk-tolerant (or more risk-averse) would help
increase their efficiency in their job. Finally, based upon our
results, training security analysts about the similar and
dissimilar features between threats and non-threats in different
cyber-attacks will help make analysts more precise in their job.


CONCLUSION 

Due to the growing threat to our cyber infrastructure and the
heightened need to implement cybersecurity, it becomes important to
evaluate the cyber situation awareness (cyber-SA) of security
analysts in different cyber-attack scenarios. In this chapter, we
suggest a memory-based account, based upon instance-based learning
theory, of the decisions of a security analyst who is put in a
popular cyber-attack scenario of an island-hopping attack. Our
results indicate that the cyber-SA of an analyst is a function of
his memory of threat and non-threat events, his risk-tolerance, and
the similarity methods he uses to compare network events to prior
experiences of events in his memory. Based upon our predictions, it
might be helpful to devise analyst job training that makes analysts
cautious about the possibility of cyber threats, makes them less
risk-tolerant, and enables them to look for features in the
attributes of network events that indicate potential threats.

KEY TERMS AND DEFINITIONS

Cyber-Attack: Also known as cyber-warfare; the use of computers and
the Internet in conducting warfare in cyberspace.

Cyber-Situation Awareness: When a security incident occurs, the top
three questions security administrators would ask are in essence:
What has happened? Why did it happen? What should I do? Answers to
these questions form the "core" of Cyber Situational Awareness.

Dynamic Decision-Making: The interdependent decision making that
takes place in an environment that changes over time, either due to
the previous actions of the decision maker or due to events that
are outside of the control of the decision maker.

Instance-Based Learning Theory: A theory of how humans make
decisions in dynamic tasks. According to the theory, individuals
rely on their accumulated experience to make decisions by
retrieving past solutions to similar situations stored in memory.
Thus, decision accuracy can only improve gradually and through
interaction with similar situations.

Intrusion-Detection System: A device or software application that
monitors network and/or system activities for malicious activities
or policy violations and produces reports to a security analyst.

Network Events: Events that take place over a network, such as a
user on a workstation opening a file that resides on a remote
server. These events could be further classified as threats
(executed by a cyber-attacker) or non-threats (executed by a normal
user of the network without any malicious intentions).

Security Analyst: A decision-maker who is in charge of protecting
the online operations of a corporate network (e.g., an online
retail company with an external Webserver and an internal
fileserver) from threats of random or organized cyber-attacks.